BUG: HDFStore failures on timezone-aware data #20595
Codecov Report
@@ Coverage Diff @@
## master #20595 +/- ##
==========================================
+ Coverage 91.82% 91.82% +<.01%
==========================================
Files 153 153
Lines 49256 49259 +3
==========================================
+ Hits 45229 45232 +3
Misses 4027 4027
Continue to review full report at Codecov.
pandas/tests/io/test_pytables.py
Outdated
@@ -2793,10 +2793,20 @@ def test_empty_series_frame(self):
self._check_roundtrip(df2, tm.assert_frame_equal)

def test_empty_series(self):
for dtype in [np.int64, np.float64, np.object, 'm8[ns]', 'M8[ns]']:
for dtype in [np.int64, np.float64, np.object, 'm8[ns]', 'M8[ns]',
Can you use `@pytest.mark.parametrize` to parametrize this test over `dtype` instead of using a `for` loop?
Done!
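For readers following along, the requested change could look roughly like the sketch below. It is not the code from this PR: `roundtrip_fixed` and `test_empty_series_roundtrip` are hypothetical names, the test writes through the public `to_hdf`/`read_hdf` functions rather than the test suite's `_check_roundtrip` helper, and the built-in `object` replaces `np.object`, which newer NumPy releases no longer provide. Whether every case passes depends on the pandas version in use.

```python
import numpy as np
import pandas as pd
import pandas.testing as pdt
import pytest

# Dtypes taken from the diff under review, plus the US/Eastern case the
# reviewers ask for. `object` stands in for the removed `np.object` alias.
DTYPES = [
    np.int64, np.float64, object, "m8[ns]", "M8[ns]",
    "datetime64[ns, UTC]", "datetime64[ns, US/Eastern]",
]


def roundtrip_fixed(obj, path):
    """Hypothetical helper: write obj to a fixed-format store, read it back."""
    obj.to_hdf(path, key="obj", format="fixed")
    return pd.read_hdf(path, key="obj")


@pytest.mark.parametrize("dtype", DTYPES)
def test_empty_series_roundtrip(dtype, tmp_path):
    # HDFStore needs PyTables; skip the case when it is not installed.
    pytest.importorskip("tables")
    s = pd.Series(dtype=dtype)
    result = roundtrip_fixed(s, str(tmp_path / "empty.h5"))
    pdt.assert_series_equal(result, s)
```

With parametrization, each dtype is collected as its own test case, so a failure for one dtype is reported by name and does not hide the others the way a failing iteration of a `for` loop would.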
doc/source/whatsnew/v0.23.0.txt
Outdated
@@ -1099,6 +1099,7 @@ I/O
- Bug in :meth:`pandas.io.json.json_normalize` where subrecords are not properly normalized if any subrecords values are NoneType (:issue:`20030`)
- Bug in ``usecols`` parameter in :func:`pandas.io.read_csv` and :func:`pandas.io.read_table` where error is not raised correctly when passing a string. (:issue:`20529`)
- Bug in :func:`HDFStore.keys` when reading a file with a softlink causes exception (:issue:`20523`)
- Bug in :class:`HDFStore` for Series and empty DataFrames with timezone-aware data (:issue:`20594`)
this is not very clear: is this for Series generally and empty DataFrames? this is only for a fixed store.
@@ -2771,7 +2765,11 @@ def read(self, **kwargs):
def write(self, obj, **kwargs):
super(SeriesFixed, self).write(obj, **kwargs)
self.write_index('index', obj.index)
self.write_array('values', obj.values)
if is_datetime64tz_dtype(obj.dtype):
this is reaching into the internal implementation way too much. you can use `obj.get_values()` here
The problem with `obj.get_values()` is that it returns an array with dtype `datetime64[ns]`, but we need a `DatetimeIndex` to preserve the timezone information. How about `obj.dt._get_values()`?
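A minimal sketch of the timezone loss described here, under some assumptions: `Series.get_values()` has since been removed from pandas, so the example uses `Series.values` as a stand-in, which likewise returns a plain NumPy array for timezone-aware data. The variable names are illustrative only.

```python
import pandas as pd

# A small timezone-aware Series; US/Eastern is the zone the reviewers mention.
s = pd.Series(pd.date_range("2018-01-01", periods=3, tz="US/Eastern"))

# Plain NumPy values: a datetime64 array with no timezone attached
# (the timestamps are converted to UTC first).
raw = s.values

# Wrapping the Series in a DatetimeIndex keeps the timezone.
idx = pd.DatetimeIndex(s)

print(s.dtype)    # timezone-aware dtype
print(raw.dtype)  # naive NumPy datetime64 dtype
print(idx.tz)     # timezone is still available here
```

Writing `raw` to the store would therefore drop the zone, which is why the write path needs an object that still carries it.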
so `write_array` already handles the objects correctly, try just passing `obj` into that (you may need to add a case in `write_array`, but see where this gets you)
pandas/tests/io/test_pytables.py
Outdated
self._check_roundtrip(s, tm.assert_series_equal)
@pytest.mark.parametrize('dtype', [
np.int64, np.float64, np.object, 'm8[ns]', 'M8[ns]',
'datetime64[ns, UTC]'
add US/Eastern as well
pandas/tests/io/test_pytables.py
Outdated
s = Series(dtype=dtype)
self._check_roundtrip(s, tm.assert_series_equal)

def test_series_timezone(self):
parameterize these both with US/Eastern as well
can you update to last comments
can you rebase and respond to comments
closing as stale. would accept if rebased and comments addressed.
git diff upstream/master -u -- "*.py" | flake8 --diff